Candide: A Statistical Machine Translation System
Authors
Abstract
The Candide project has two objectives. First, we want to develop a fully automatic, large-vocabulary, French-to-English translation system. Second, we want to develop an interactive translator's workstation that will increase the speed and productivity of a human translator. The philosophy of the project is to combine, within a probabilistic framework, both statistical information acquired automatically from bilingual corpora and linguistic knowledge provided by human experts.
Similar resources
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
The Candide System for Machine Translation
We present an overview of Candide, a system for automatic translation of French text to English text. Candide uses methods of information theory and statistics to develop a probability model of the translation process. This model, which is made to accord as closely as possible with a large body of French and English sentence pairs, is then used to generate English translations of previously u...
Improvement and Development of an English-to-Persian Computer-Assisted Translation System
In recent years, significant improvements have been achieved in statistical machine translation (SMT), but even the best machine translation technology is still far from replacing or even competing with human translators. Another way to increase the productivity of the translation process is a computer-assisted translation (CAT) system. In a CAT system, the human translator begins to type the tra...
Toward a Scoring Function for Quality-Driven Machine Translation
We describe how we constructed an automatic scoring function for machine translation quality; this function makes use of arbitrarily many pieces of natural language processing software that have been designed to process English language text. By machine-learning values of functions available inside the software and by constructing functions that yield values based upon the software output, we ar...
A new model for Persian multi-part words edition based on statistical machine translation
Multi-part words in the English language are hyphenated: a hyphen separates the different parts. The Persian language has multi-part words as well. Under Persian morphology, a half-space character is needed to separate the parts of a multi-part word, but in many cases people incorrectly use a space character instead of the half-space character. This common incorrect use of the space leads to some s...